Study of Indexing Techniques to Improve the Performance of Information Retrieval in Telugu Language

Author

  • Kolikipogu Ramakrishna
Abstract

Information Retrieval Systems (IRS) have become popular through the World Wide Web. The text information available on the web about objects of all kinds, such as documents, web pages, images, videos and audio files, is increasing exponentially day by day. When a text repository grows to the limit of the server's memory, finding a particular text unit, whether a word or a document, becomes a tedious task. Representing these objects with text information gives summarized features that help decide at first glance whether to access the identified unit. Instead of matching the exact query against the document, a set of keywords is used to determine the relevance of the document. If a set of keywords represents a document, it is easy to match the keywords of the query against the keywords of the document and decide its relevance. Finding keywords to represent a complete unit is called indexing. Keywords are designated units that make a document easy to locate with any search engine. A keyword maps to all the documents containing that indexed word. This retrieval problem is addressed by identifying the index words or phrases of a document. Together, the indexing terms represent the whole document and act as ambassadors of the unit. In this paper we study the effect of various indexing techniques, namely manual, automatic and semi-automatic, on 10,000 Telugu text documents. Statistical indexing is taken as the baseline approach, and its results are compared with those of the other techniques. We observed that the results improve when moving from statistical representations to semantic representations.

Keywords: Keywords, Indexing Terms, Manual Indexing, Automatic Indexing, Statistical Indexing, Semantic based Indexing, Telugu Text Corpus, N-gram, Inverted File Structure.
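The idea that a keyword maps to all documents containing it, and that relevance can be judged by matching query keywords against document keywords, can be illustrated with a small inverted file. The sketch below is only an illustration under stated assumptions, not the method evaluated in the paper: the three toy Telugu documents, the whitespace tokenization, the overlap-count scoring, and the function names `build_inverted_index` and `rank_by_keyword_overlap` are all hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each keyword to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: plain whitespace tokenization, no stemming or stop-word removal.
        for token in text.split():
            index[token].add(doc_id)
    return index


def rank_by_keyword_overlap(index, query):
    """Score each document by the number of distinct query keywords it contains."""
    scores = defaultdict(int)
    for token in set(query.split()):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy corpus; these documents are not taken from the paper's data.
docs = {
    "d1": "తెలుగు భాష సమాచారం",
    "d2": "సమాచారం శోధన పద్ధతి",
    "d3": "తెలుగు సమాచారం శోధన",
}

index = build_inverted_index(docs)
ranking = rank_by_keyword_overlap(index, "తెలుగు శోధన")

for doc_id, score in ranking:
    print(doc_id, score)
```

Here the document containing both query keywords is ranked ahead of those containing only one. A statistical or semantic indexing scheme would replace the simple overlap count with a weighting of its own.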


Similar resources

The State of Information Retrieval in the Namayeh and Nama Databases and an Assessment of the Effectiveness of Using Controlled Vocabulary in Indexing These Two Databases

Purpose: This study was carried out to determine the level of precision, recall, and searching time for “Nama” and “Namayeh” databases, as well as to find out which of the indexing tools (thesaurus and Dewey decimal classification) helps us more in improvement of information retrieval. Methodology: This study is an analytical survey in which the necessary data was collected by direct observati...


An Approach for Improving Execution Performance in Inference Network Based Information Retrieval

The inference network retrieval model provides the ability to combine a variety of retrieval strategies expressed in a rich query language. While this power yields impressive retrieval effectiveness, it also presents barriers to the incorporation of traditional optimization techniques intended to improve the execution efficiency, or speed, of retrieval. The essence of these optimization techniq...


Bitmap Indexing-based Clustering and Retrieval of XML Documents

This paper describes a bitmap indexing based technique to cluster XML documents. XML documents can be hierarchically represented by elements. To improve performance of information retrieval, documents can be indexed using bitmap techniques. Such a bitmap index is sparse, meaning it contains unnecessarily many zero bits, especially for the word dimension. To remove zero bits and improve the perf...


Content Based Radiographic Images Indexing and Retrieval Using Pattern Orientation Histogram

Introduction: Content Based Image Retrieval (CBIR) is a method of image searching and retrieval in a database. In medical applications, CBIR is a tool used by physicians to compare the previous and current medical images associated with patients' pathological conditions. As the volume of pictorial information stored in medical image databases is growing, efficient image indexing and retri...


Investigating the Impact of Concept-Based Image Indexing on Image Retrieval Using the Google Search Engine

Purpose: The purpose of the present study is to investigate the Impact of Concept-based Image Indexing on Image Retrieval via Google. Due to the importance of images, this article focuses on the features taken into account by Google in retrieving the images. Methodology: The present study is a type of applied research, and the research method used in it comes from quasi-experimental and techno...



Publication date: 2013